The "power of choice" has been shown to radically alter the behavior of a number of randomized 
algorithms. Here we explore the effects of choice on models of tree and network growth. In our 
models each new node has k randomly chosen contacts, where k > 1 is a constant. It then attaches 
to whichever one of these contacts is most desirable in some sense, such as its distance from the root 
or its degree. Even when the new node has just two choices, i.e., when k = 2, the resulting network 
can be very different from a random graph or tree. For instance, if the new node attaches to the 
contact which is closest to the root of the tree, the distribution of depths changes from Poisson to a 
traveling wave solution. If the new node attaches to the contact with the smallest degree, the degree 
distribution is closer to uniform than in a random graph, so that with high probability there are no 
nodes in the network with degree greater than O(log log N). Finally, if the new node attaches to the 
contact with the largest degree, we find that the degree distribution is a power law with exponent 
-1 up to degrees roughly equal to k, with an exponential cutoff beyond that; thus, in this case, we 
need k ≫ 1 to see a power law over a wide range of degrees. 



PACS numbers: 89.75.Hc,02.50.Ey,05.40.-a 

I. FORMULATION OF THE MODEL 

Over the past decade, the "power of choice" has emerged as a theme in research on optimization and randomized algorithms [1, 2, 3, 4]. Consider a random decision process. Typically at each step of the process a decision is reached by choosing one outcome at random and accepting this choice. Now, rather than one random alternative being presented at each decision point, let a small set of randomly generated alternatives be presented, and let the best one be selected. It has been shown that with as few as two alternatives at each decision point, the resulting properties of the process can be radically altered. This was first explored in the context of load-balancing the allocation of jobs arriving at random times to a batch of processors. With as few as two choices, the maximum load on any one processor drops dramatically from O(log N) to O(log log N). Increasing the number of choices beyond two only improves this by a constant factor, illustrating the "power of two choices."

Here we explore the effect of choice on random network 
growth. Perhaps the simplest way to build a growing ran- 
dom network is to attach each new node to an existing 
node which is chosen uniformly at random. This pro- 
cess generates random recursive trees which have been 
studied in great detail (see, e.g., [6, 8] and references 
therein). Here we discuss a simple generalization: for 
each new node we choose k > 1 existing 'contact' nodes 
uniformly at random, select the 'best' one according to 
some definition, and connect the new node to it. This 
creates a random tree [3] whose statistics may be very 
different from those of a random recursive tree. 

We have to define, of course, the 'quality' of the node 
so that we can choose the best one. One natural defi- 
nition of quality in a tree is distance to the root — the 



closer to the root, the better, so that the new node at- 
taches to whichever one of its contacts is closest to the 
root (and, if more than one contact has this smallest dis- 
tance, we choose one of them randomly). This could cor- 
respond, for instance, to someone joining a hierarchical 
organization, and choosing to become a daughter node 
of whichever one of their k contacts is highest up in the 
hierarchy. 

Another natural definition is to measure quality by 
degree of the contact node: for instance, to attach the 
new node to the contact node with highest degree, again 
breaking ties randomly. Note that this is very different 
from the preferential attachment process [10], where the 
contact is selected from the entire graph with probabil- 
ity proportional to its degree. This latter process requires 
complete knowledge of the degree of all existing nodes. 
In contrast, our model assumes that the new node pos- 
sesses only a small amount of local information, namely, 
the degrees of a small number of potential contacts. This 
brings us to another motivation for this work: the desire 
to understand the effects of limited, local information on 
network growth. 
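
The growth rules described above are straightforward to simulate. The following Python sketch is illustrative only: the function name `grow_tree`, the rule labels, and the decision to sample the k contacts independently (with replacement) are assumptions made here for concreteness, not details fixed by the text.

```python
import random

def grow_tree(n, k, rule, seed=0):
    """Grow a tree of n nodes. Each new node samples k contacts uniformly
    at random (with replacement) and attaches to the best one under `rule`."""
    rng = random.Random(seed)
    depth = [0]    # depth[v]: distance of node v from the root (node 0)
    degree = [0]   # degree[v]: number of links attached to node v
    for new in range(1, n):
        # The contacts are i.i.d., so keeping the first best one in the list
        # amounts to breaking ties uniformly at random.
        contacts = [rng.randrange(new) for _ in range(k)]
        if rule == "smallest_depth":
            target = min(contacts, key=lambda v: depth[v])
        elif rule == "largest_depth":
            target = max(contacts, key=lambda v: depth[v])
        elif rule == "highest_degree":
            target = max(contacts, key=lambda v: degree[v])
        elif rule == "lowest_degree":
            target = min(contacts, key=lambda v: degree[v])
        else:
            raise ValueError(f"unknown rule: {rule}")
        depth.append(depth[target] + 1)
        degree.append(1)          # the new node is a leaf
        degree[target] += 1
    return depth, degree
```

Calling `grow_tree(10**5, 2, "smallest_depth")`, for instance, returns the depth and degree of every node, from which the depth or degree distribution can be tabulated. Setting k = 1 gives a random recursive tree under any of the four rules.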

For the smallest-depth model, we find a marked difference in behavior for k ≥ 2 versus k = 1. The measure of interest in this case is the depth distribution (the fraction of nodes at each depth j). For k = 1, i.e., a random recursive tree, this distribution is Poisson. For k ≥ 2, the same Poisson distribution is observed for distances close to the root; for larger distances, however, the depth distribution obeys a traveling wave solution. We also consider using maximal depth, rather than minimal depth, as the contact node selection criterion and find a similar traveling-wave solution. 

For the highest-degree model, we find that the degree distribution decays exponentially for degree i > k. For i < k the degree distribution exhibits power-law-like behavior; thus, observing a power law over any substantial regime requires k ≫ 1. In other words, the new node needs a large amount of knowledge about the state of the system to achieve a power-law distribution. 

Finally, in analogy to the above-referenced works on load balancing, the lowest-degree model achieves a degree distribution which is very close to uniform, in which the maximum degree in the entire graph is O(log log N), as opposed to the maximum degree in a Poisson distribution, which is roughly O(log N). 



II. SMALLEST DEPTH 

Let N be the total number of nodes and $D_j(N)$ be the number of nodes at distance j from the root. By definition, $D_0(N) = 1$, since the root is at distance 0 from itself. Thus $D_0(N)$ is a deterministic quantity, while $D_j(N)$ with $1 \le j < N$ are random variables. We shall focus on their averages $Q_j(N) = \langle D_j(N) \rangle$. An average value provides a good description of a random variable when it is large and hence fluctuations are relatively small; we will see that this is indeed correct for $D_1(N)$. 

To set the stage we begin in Sect. II A with the simpler case of random recursive trees, for which everything is already known (see e.g. [11]). We then consider the influence of 2 or more choices in Sect. II B. 



A. Random recursive trees and depth 

The quantity $D_j$ grows each time a node at distance $j-1$ is selected as the contact node. The average depth distribution thus satisfies the master equation [12] 

$$Q_j(N+1) = Q_j(N) + \frac{1}{N}\, Q_{j-1}(N). \tag{1}$$

This equation is exact and it applies even for $j = 0$ if we set $Q_{-1}(N) = 0$. Using the recursive nature of (1), we first solve for $Q_1(N)$, then $Q_2(N)$, etc. This gives 

$$Q_j(N+1) = \sum_{1 \le m_1 < \cdots < m_j \le N} \frac{1}{m_1 \times \cdots \times m_j}. \tag{2}$$



Equivalently, we can recast the j-fold sums into simple sums, although the results look less neat. For example, 

$$Q_1(N) = H_{N-1}, \qquad Q_2(N) = \frac{1}{2}\left[\left(H_{N-1}\right)^2 - H^{(2)}_{N-1}\right], \tag{3}$$

where $H^{(p)}_N = \sum_{1 \le n \le N} n^{-p}$ are harmonic numbers. The asymptotic behaviors of $H_N \equiv H^{(1)}_N$, $H^{(2)}_N$, and other harmonic numbers are well-known [13], and the resulting asymptotics of the depth distribution are 

$$Q_1(N+1) = \ln N + \gamma + \frac{1}{2N} - \frac{1}{12N^2} + \ldots$$

$$Q_2(N+1) = \frac{1}{2}(\ln N)^2 + \gamma \ln N + \ldots$$

where $\gamma \approx 0.577$ is the Euler-Mascheroni constant. Analogous results hold for $Q_j(N)$ for larger j. 
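
As a check on the exact results, the master equation (1) can be iterated in rational arithmetic and compared with the harmonic-number expressions. The sketch below is illustrative; the helper names `depth_averages` and `harmonic` are introduced here and do not appear in the text.

```python
from fractions import Fraction

def depth_averages(n_max, j_max):
    """Iterate Q_j(N+1) = Q_j(N) + Q_{j-1}(N)/N exactly, starting from a lone root.
    Returns the list [Q_0(n_max), ..., Q_{j_max}(n_max)]."""
    q = [Fraction(1)] + [Fraction(0)] * j_max   # Q_0(1) = 1, Q_j(1) = 0 for j >= 1
    for n in range(1, n_max):
        # Build Q(n+1) from Q(n); Q_0 stays equal to 1 because Q_{-1} = 0.
        q = [q[0]] + [q[j] + q[j - 1] / n for j in range(1, j_max + 1)]
    return q

def harmonic(n, p=1):
    """Generalized harmonic number H_n^(p) = sum_{m=1}^{n} m^(-p)."""
    return sum(Fraction(1, m ** p) for m in range(1, n + 1))
```

With these helpers, `depth_averages(50, 2)[1]` equals `harmonic(49)`, in agreement with $Q_1(N) = H_{N-1}$, and the entries of `depth_averages(N, N - 1)` sum to N, since no node can be deeper than N - 1.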

If we merely want to establish the leading asymptotic behavior, we can replace the summation in (2) by integration. This leads to the simple result 

$$Q_j(N) \simeq \frac{(\ln N)^j}{j!}, \tag{4}$$



showing that in the limit $N \to \infty$, the depth distribution is Poisson with mean $\ln N$. Alternatively, we can derive (4) within a continuum approach by replacing finite differences by derivatives in the $N \to \infty$ limit of (1). This procedure recasts the discrete master equations into the differential equations 

$$\frac{dQ_j}{dN} = \frac{1}{N}\, Q_{j-1}. \tag{5}$$

Solving (5) one recovers (4). 

The normalization requirement $\sum_{j \ge 0} D_j(N) = N$ implies the sum rule for the averages 

$$\sum_{j \ge 0} Q_j(N) = N. \tag{6}$$

The continuum approximation (4) agrees with the sum rule (6), implying that it approximates the depth distribution well over the entire range. We therefore use it to find the depth of the random recursive tree. The depth is defined as the maximal depth $j_{\max}$ present in the tree. The criterion $Q_{j_{\max}} \sim 1$ leads to the estimate [11] 

$$j_{\max} \simeq e \ln N. \tag{7}$$

It is possible to derive this result within the exact (discrete) approach and to determine the fluctuations of $j_{\max}$. However, for our purposes (7) is sufficient. 

B. The model with k = 2 choices 

Now suppose the new node has k = 2 choices. In this case, we have $D_j(N+1) = D_j(N) + 1$ if the two contact nodes have minimum depth $j-1$, or equivalently, if both of them have depth at least $j-1$, but they do not both have depth greater than $j-1$. The probability of this is 

$$N^{-2}\left[\Big(\sum_{i \ge j-1} D_i\Big)^2 - \Big(\sum_{i \ge j} D_i\Big)^2\right] = N^{-2}\left[D_{j-1}^2 + 2 D_{j-1} \sum_{i \ge j} D_i\right]. \tag{8}$$

This leads to the exact recurrence 

$$Q_j(N+1) = Q_j(N) + N^{-2} \left\langle D_{j-1}^2 + 2 D_{j-1} \sum_{i \ge j} D_i \right\rangle.$$



Unfortunately, this is not very helpful since the average of the product of random quantities differs from the product of their averages, viz. $\langle D_i D_j \rangle \ne \langle D_i \rangle \langle D_j \rangle$. One can, of course, write down an exact recurrence for $\langle D_i D_j \rangle$, but this involves third order moments $\langle D_i D_j D_k \rangle$, and so on. Thus the hierarchical nature of the governing equations does not allow us to obtain complete and rigorous results as is possible for the case k = 1. 

The cases of j = 1, 2 are exceptional and one can determine $Q_1$ and $Q_2$ analytically. For j = 1 the analysis is especially simple since $D_0 = 1$, $\sum_{i \ge 1} D_i = N - 1$, and the growth rate (8) simplifies to $[1 + 2(N-1)]/N^2$. Therefore the average number of the neighbors of the root grows according to an exact and closed recurrence 

$$Q_1(N+1) = Q_1(N) + \frac{2N-1}{N^2}. \tag{9}$$

Solving (9) subject to $Q_1(1) = 0$ yields 

$$Q_1(N) = \sum_{n=1}^{N-1} \frac{2n-1}{n^2} = 2H_{N-1} - H^{(2)}_{N-1}. \tag{10}$$

Similarly for j = 2 we use the relation $\sum_{i \ge 2} D_i = N - 1 - D_1$ and obtain 

$$Q_2(N+1) = Q_2(N) + 2\,\frac{N-1}{N^2}\, Q_1(N) - \frac{\langle D_1^2(N) \rangle}{N^2}. \tag{11}$$

To obtain a closed recurrence for $Q_2$ we need to determine $\langle D_1^2(N) \rangle$, the average of the square of the number of neighbors of the root. Then (8) leads to 

$$D_1(N+1) = \begin{cases} D_1(N) + 1 & \text{prob } N^{-2}(2N-1) \\ D_1(N) & \text{prob } 1 - N^{-2}(2N-1) \end{cases}$$

Squaring this equation and averaging we obtain 

$$\langle D_1^2(N+1) \rangle = \left(1 - \frac{2N-1}{N^2}\right) \langle D_1^2(N) \rangle + \frac{2N-1}{N^2} \left[\langle D_1^2(N) \rangle + 2Q_1(N) + 1\right].$$

Rather than directly solving this recurrence, we can use it together with (9) to establish a simpler recurrence for the variance $V_1(N) = \langle D_1^2(N) \rangle - \langle D_1(N) \rangle^2$. We find 

$$V_1(N+1) = V_1(N) + \frac{2N-1}{N^2} - \left[\frac{2N-1}{N^2}\right]^2, \tag{12}$$

which is readily solved to give 

$$V_1(N+1) = 2H_N - 5H^{(2)}_N + 4H^{(3)}_N - H^{(4)}_N.$$

Thus $\langle D_1^2 \rangle \ne \langle D_1 \rangle^2$, yet the variance is asymptotically $2 \ln N$ and therefore fluctuations of the random variable $D_1$ are indeed small compared to its average which grows as $2 \ln N$, see (10). 

We determined $\langle D_1^2(N) \rangle = V_1(N) + Q_1^2(N)$ and therefore $Q_2$ satisfies a closed solvable recurrence (11). The solution reads 

$$Q_2(N) = \sum_{n=1}^{N-1} \frac{1}{n^2} \left\{ 2(n-1)\, Q_1(n) - V_1(n) - [Q_1(n)]^2 \right\}.$$
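
The exact results for the neighbors of the root in the k = 2 model can be checked by iterating the recurrences in rational arithmetic. The sketch below is illustrative, with hypothetical helper names; it assumes that the new node joins the root with probability $(2N-1)/N^2$, as in (9), and tracks the mean and second moment of $D_1$.

```python
from fractions import Fraction

def harmonic(n, p=1):
    """Generalized harmonic number H_n^(p) = sum_{m=1}^{n} m^(-p)."""
    return sum(Fraction(1, m ** p) for m in range(1, n + 1))

def root_neighbor_moments(n_max):
    """Return the mean Q_1 and the variance V_1 of D_1 at network size n_max."""
    q1 = Fraction(0)   # <D_1(1)>: a lone root has no neighbors
    m2 = Fraction(0)   # <D_1(1)^2>
    for n in range(1, n_max):
        p = Fraction(2 * n - 1, n * n)   # probability that the new node joins the root
        m2 = m2 + p * (2 * q1 + 1)       # second-moment recurrence, using the old mean
        q1 = q1 + p                      # recurrence (9) for the mean
    return q1, m2 - q1 ** 2
```

For N = 30, for example, the returned mean coincides with $2H_{29} - H^{(2)}_{29}$ and the variance with $2H_{29} - 5H^{(2)}_{29} + 4H^{(3)}_{29} - H^{(4)}_{29}$, as expected from (10) and from summing (12).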



For $j \ge 3$, the problem becomes genuinely hierarchical and intractable. If we are seeking only the leading behavior, however, we can proceed. When $N \gg 1$ and j is sufficiently small, namely such that $\sum_{i<j} Q_i \ll N$, we can replace the sum $\sum_{i \ge j} D_i$ by N and the growth rate (8) by $2D_{j-1}/N$. Thus we arrive at a set of differential equations 

$$\frac{dQ_j}{dN} = \frac{2}{N}\, Q_{j-1}. \tag{13}$$

Solving these equations we obtain 

$$Q_j(N) \simeq \frac{(2 \ln N)^j}{j!}. \tag{14}$$



We check the validity of this approximation by substituting it back into our assumption $\sum_{i<j} Q_i \ll N$ which we used in the derivation of (13). This suggests that (14) holds when $j < v \ln N$ (i.e., for small distances from the root) where v is the smallest positive root of 

$$v \ln\frac{2e}{v} = 1. \tag{15}$$

We can write v in terms of Lambert's function W(x), defined as the root of $W e^W = x$: 

$$v = -1/W_{-1}(-1/2e), \tag{16}$$

where $W_{-1}$ denotes the -1st branch of the Lambert function. Numerically, v = 0.373365... 
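
The roots of (15) are easy to find numerically without a Lambert-function library. The sketch below uses plain bisection; the function name `front_velocities` and the generalization from 2 to a general number of choices k inside the logarithm (the form that appears later as (36)) are choices made for this illustration.

```python
import math

def front_velocities(k, tol=1e-12):
    """Return the two positive roots (smaller, larger) of v*ln(k*e/v) = 1 for k > 1."""
    def f(v):
        return v * math.log(k * math.e / v) - 1.0

    def bisect(lo, hi):
        # f changes sign exactly once on each bracket used below
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if (f(lo) < 0) == (f(mid) < 0):
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # f -> -1 as v -> 0+, f(k) = k - 1 > 0, and f(k*e) = -1,
    # so one root lies in (0, k) and the other in (k, k*e).
    return bisect(1e-12, k), bisect(k, k * math.e)
```

For k = 2 the smaller root reproduces v = 0.373365..., while the larger root, 4.31107..., is the one relevant to the largest-depth model of Sec. III.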

When $j > v \ln N$ we cannot use (14). However, as long as $Q_j$ is much larger than 1, let us assume that the fluctuations in $D_j$ are small. In that case we can replace averages $\langle D_j D_k \rangle$ by $Q_j Q_k$, and in this regime we obtain 

$$\frac{dQ_j}{dN} = N^{-2}\left[Q_{j-1}^2 + 2 Q_{j-1} \sum_{k \ge j} Q_k\right]. \tag{17}$$



It is convenient to introduce the cumulative variable 

$$q_j = N^{-1} \sum_{i \ge j} Q_i, \tag{18}$$

that is, the average fraction of nodes whose depth is at least j. Summing (17) over all $i \ge j$ we arrive at a neat recurrence 

$$N \frac{dq_j}{dN} = -q_j + q_{j-1}^2. \tag{19}$$

The form of this equation suggests the introduction of a new 'time' variable 

$$t = \ln N. \tag{20}$$

This transformation recasts (19) into 

$$\frac{dq_j}{dt} = -q_j + q_{j-1}^2, \tag{21}$$

which should be solved subject to the step function initial condition: $q_j(0) = 1$ for $j \le 0$ and $q_j(0) = 0$ for $j > 0$. 
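
Equation (21) with the step initial condition is simple to integrate numerically. The sketch below uses a forward Euler step; the step size `dt`, the truncation depth `j_max`, and the function names are assumptions of this illustration, and the exponent k is left as a parameter so that k = 2 reproduces (21).

```python
def integrate_front(t_max, k=2, dt=0.01, j_max=60):
    """Euler-integrate dq_j/dt = -q_j + q_{j-1}^k from step initial data.
    q[0] stands for every j <= 0 and is held at 1; q[j] with j >= 1 starts at 0."""
    q = [1.0] + [0.0] * j_max
    for _ in range(int(round(t_max / dt))):
        q = [1.0] + [q[j] + dt * (q[j - 1] ** k - q[j]) for j in range(1, j_max + 1)]
    return q

def mean_depth(q):
    """Average depth, i.e. the sum over j >= 1 of the fraction of nodes with depth >= j."""
    return sum(q[1:])
```

Plotting `integrate_front(t)` for several values of t shows a profile that drops from one to zero over a few values of j while its position advances with t; `mean_depth` gives a convenient measure of that position.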

Equation (21) has appeared in various contexts (see e.g. [14]) and while it cannot be solved exactly, the asymptotic behavior of its solution is understood. In the long time limit, the solution approaches a 'traveling wave' form, 

$$q_j(t) \to q(j - vt). \tag{22}$$



Plugging (22) into (21) one finds that q(x) satisfies 

$$v \frac{dq}{dx} = q(x) - q(x-1)^2. \tag{23}$$

The boundary conditions are 

$$q(-\infty) = 1, \qquad q(+\infty) = 0. \tag{24}$$



The boundary-value problem (23)-(24) is still intractable analytically. However, the velocity v can be determined even without a complete solution for q(x). The method relies on the analysis of the tail region $x \to -\infty$. One notices that (23) admits an exponential solution in this region, 

$$1 - q(x) \propto e^{\lambda x}. \tag{25}$$

Plugging this into (23) and keeping terms linear in $1 - q$ shows that the velocity v is related to $\lambda$ via the dispersion relation [14] 

$$v = \frac{1 - 2e^{-\lambda}}{\lambda}. \tag{26}$$



The maximum of $v = v(\lambda)$ is given by (15) and it occurs at the largest positive root $\lambda$ of the transcendental equation $2(1 + \lambda) = e^{\lambda}$. This is 

$$\lambda = -1 - W_{-1}(-1/2e), \tag{27}$$

or numerically, $\lambda = 1.67835...$ Comparing with (16), we see that $\lambda$ and v are related as follows, 

$$\lambda = -1 + 1/v. \tag{28}$$
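
The maximum of the dispersion relation (26) can also be located by direct numerical search. The sketch below uses a ternary search, which assumes that $v(\lambda)$ has a single maximum on the search interval; the function names, the interval, and the optional parameter k (with k = 2 corresponding to (26)) are assumptions of this illustration.

```python
import math

def velocity(lam, k=2):
    """Dispersion relation v(lambda) = (1 - k*exp(-lambda)) / lambda."""
    return (1.0 - k * math.exp(-lam)) / lam

def selected_velocity(k=2, lo=1e-3, hi=20.0, iters=200):
    """Ternary search for the maximum of v(lambda), assumed unimodal on [lo, hi].
    Returns the maximizing lambda and the maximal velocity."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if velocity(a, k) < velocity(b, k):
            lo = a   # the maximum lies to the right of a
        else:
            hi = b   # the maximum lies to the left of b
    lam = 0.5 * (lo + hi)
    return lam, velocity(lam, k)
```

For k = 2 the search returns $\lambda \approx 1.67835$ and $v \approx 0.373365$, and the two values satisfy the relation $\lambda = -1 + 1/v$ of (28) to within the numerical accuracy of the search.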



Strictly speaking, one can only assert that the velocity does not exceed the maximum of (26). However, the so-called selection principle tells us that this extremal value is realized for any initial conditions which vanish sufficiently rapidly at infinity. The selection principle has been rigorously proven for a few nonlinear parabolic partial differential equations. Yet heuristic arguments and numerical evidence indicate that its range of applicability is much broader. This is reviewed in [15] in the context of partial differential equations and in [16] in the context of difference equations. 

Thus there is a sharp front at depth $j_{\rm front} \simeq vt = v \ln N$ to leading order, where the depth of most nodes in the tree is concentrated. Furthermore, the width of this front remains finite even in the limit $N \to \infty$. It is also possible to compute the sub-leading correction to the position of the front [14], giving an improved estimate of its location: 

$$j_{\rm front} = v \ln N + \frac{3}{2\lambda} \ln\ln N. \tag{29}$$



To estimate the maximum depth $j_{\max}$, it is necessary to bound the tail of q(x) in the positive direction $x \to +\infty$. To do this, note that by definition q(x) is monotonically decreasing, and by (23) this implies that 

$$q(x) < q(x-1)^2$$

and therefore that this tail is doubly exponential, 

$$q(x) \propto e^{-A\, 2^x} \tag{30}$$

for some constant A > 0. Setting q(x) = 1/N then gives the estimate 

$$j_{\max} \simeq j_{\rm front} + \frac{\ln\ln N}{\ln 2} \tag{31}$$

minus a constant $C = \ln A/\ln 2$. As shown in Fig. 1, (29) and (31) are indeed excellent estimates of the average and maximum depth respectively. 




FIG. 1: The average depth (circles) and maximum depth (crosses) of a tree with k = 2, averaged over $10^3$ independent trials for each value of N, and (dashed) the expressions (29) and (31) for $j_{\rm front}$ and $j_{\max}$ respectively. 

C. The effect of choice 

At first sight, it seems that having two choices instead of one does not qualitatively affect the outcome, since the depth distributions (4) and (14) both seem Poissonian, and both have typical depth O(log N). This is, however, an illusion. First of all, the distribution (4) for random recursive trees is indeed Poissonian while (14) is valid only for $j < v \ln N$. Secondly, while both types of trees have depth O(log N), choice causes the depth to be much more concentrated. This is easiest to see if we consider the cumulative depth distribution (18). For random recursive trees, $q_j(t)$ is asymptotically 

$$q_j(t) = \frac{1}{2}\, \mathrm{erfc}\!\left(\frac{j - t}{\sqrt{2t}}\right), \tag{32}$$

where erfc(z) is the complementary error function 

$$\mathrm{erfc}(z) = \frac{2}{\sqrt{\pi}} \int_z^{\infty} d\eta\, e^{-\eta^2}.$$

Thus 

$$q_j(t) = \begin{cases} 1 & j - t \ll -\sqrt{t} \\ 0 & j - t \gg +\sqrt{t} \end{cases} \tag{33}$$

The boundary layer where q changes from one to zero is not a true front as its width grows with 'time' as $\sqrt{t} \sim \sqrt{\ln N}$. 

On the other hand, for the model with choice the cumulative depth distribution has a traveling wave shape with a front of constant width. Thus 

$$q_j(t) = \begin{cases} 1 & j - j_{\rm front} \ll -1 \\ 0 & j - j_{\rm front} \gg +1 \end{cases}$$
D. Multiple choices 

What if the new node has more than two choices? The cases with $k \ge 3$ (with k constant) are qualitatively similar to the k = 2 case: the cumulative depth distribution obeys the differential equation 

$$\frac{dq_j}{dt} = -q_j + q_{j-1}^k. \tag{34}$$

Transforming this to $q_j(t) = q(j - vt)$ as before, we obtain 

$$v \frac{dq}{dx} = q(x) - q(x-1)^k. \tag{35}$$

The solution is again a traveling wave, whose velocity v depends on k. Assuming the selection principle, v is the smallest positive root of 

$$v \ln\frac{ke}{v} = 1, \tag{36}$$

which can be written in terms of Lambert's function as 

$$v = -1/W_{-1}(-1/ke). \tag{37}$$

Asymptotically, as k grows we have, to leading order, 

$$v \simeq \frac{1}{\ln ke + \ln\ln ke}. \tag{38}$$

A more precise estimate for $j_{\rm front}$ is again given by (29) with $\lambda$ given by (28). For $j \ll j_{\rm front}$, (4) and (14) generalize to 

$$Q_j(N) \simeq \frac{(k \ln N)^j}{j!}. \tag{39}$$

Finally, the tail of q(x) is doubly exponential, 

$$q(x) \approx e^{-A k^x}, \tag{40}$$

and the maximum depth is given by 

$$j_{\max} \simeq j_{\rm front} + \frac{\ln\ln N}{\ln k}. \tag{41}$$



III. LARGEST DEPTH 

We pause here to consider a model in which we reverse our definition of the 'better' node, and attach each new node to the contact node which is furthest from the root. If k = 2, then we have $D_j(N+1) = D_j(N) + 1$ whenever the maximum depth of the two nodes is $j-1$, and this occurs with probability 

$$N^{-2}\left[\Big(\sum_{i=0}^{j-1} D_i\Big)^2 - \Big(\sum_{i=0}^{j-2} D_i\Big)^2\right] = N^{-2}\left[D_{j-1}^2 + 2 D_{j-1} \sum_{i=0}^{j-2} D_i\right]. \tag{42}$$

For instance, the average number of the neighbors of the root grows according to 

$$Q_1(N+1) = Q_1(N) + \frac{1}{N^2} \tag{43}$$

and therefore 

$$Q_1(N) = H^{(2)}_{N-1}. \tag{44}$$



Thus the average number of neighbors of the root does not diverge as in the smallest depth model, but instead approaches the constant $\zeta(2) = \pi^2/6$. Generally, the behavior of $Q_j(N)$ for small j is very different from (14), viz. for j = O(1) the average number of nodes of depth j remains finite in the $N \to \infty$ limit. Therefore in contrast with the smallest depth model, the quantities $D_j(\infty)$ are not self-averaging when j = O(1) and their averages do not characterize them. Yet, the probability distribution 

$$P(s) = \mathrm{Prob}[D_1(\infty) = s] \tag{45}$$

can be determined. For instance, $D_1(2) = 1$ and the probability that the root still has one neighbor when the network size reaches N is 

$$\mathrm{Prob}[D_1(N) = 1] = \prod_{n=2}^{N-1} \left(1 - \frac{1}{n^2}\right)$$

and therefore $P(1) = 1/2$. Proceeding with this line of reasoning one obtains 

$$P(s+1) = \frac{1}{2} \sum_{2 \le n_1 < \cdots < n_s} \frac{1}{(n_1^2 - 1) \times \cdots \times (n_s^2 - 1)}, \tag{46}$$

which can be expressed as a sum involving the zeta function at positive integers. 
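
The product and the first sum can be evaluated exactly for finite sizes. The sketch below is illustrative, with hypothetical function names; it assumes, as in the largest-depth model with k = 2, that the root gains a neighbor with probability $1/n^2$ when the network has n nodes.

```python
from fractions import Fraction

def prob_single_neighbor(n):
    """Prob[D_1(n) = 1]: the root gains no neighbor at sizes 2, ..., n-1."""
    p = Fraction(1)
    for m in range(2, n):
        p *= 1 - Fraction(1, m * m)
    return p

def p_two_neighbors(n_max):
    """Partial sum of (1/2) * sum_{n >= 2} 1/(n^2 - 1), the s = 1 case of Eq. (46)."""
    return Fraction(1, 2) * sum(Fraction(1, n * n - 1) for n in range(2, n_max + 1))
```

The product telescopes to $N/[2(N-1)]$, which tends to 1/2 as $N \to \infty$. The sum for s = 1 also telescopes, so that P(2) = 3/8 under the same assumption.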

However, even though the $D_j$ are not self-averaging, there are many similarities between the smallest depth model and this one. In particular, the cumulative depth distribution has a traveling wave shape (22). Indeed, after several mappings [16] the model becomes identical to one which has appeared in studies of collision processes in gases [17], fragmentation processes [18], and other problems [13]. If we define the cumulative variable as 

$$q_j = N^{-1} \sum_{i \le j} Q_i, \tag{47}$$

then writing $q_j(t) = q(j - vt)$ gives (19), (21) and (23) again, but now with the boundary conditions 

$$q(-\infty) = 0, \qquad q(+\infty) = 1. \tag{48}$$



With these boundary conditions, (23) admits a solution whose tail in the positive direction is exponential, 

$$1 - q(x) \propto e^{-\mu x} \quad \text{as } x \to +\infty, \tag{49}$$

and the dispersion relation is now 

$$v = \frac{2e^{\mu} - 1}{\mu}. \tag{50}$$

The selection principle now suggests that v is the minimum of (50). This is the larger of the two real roots of the transcendental equation (15), which is v = 4.31107... A more precise estimate of $j_{\rm front}$ is 

$$j_{\rm front} = v \ln N - \frac{3}{2\mu} \ln\ln N, \tag{51}$$

where $\mu = 0.768039...$ is the larger root of $2(1 - \mu) = e^{-\mu}$. 

More generally, for $k \ge 2$ the velocity v is the larger real root of (36), or 

$$v = -1/W_0(-1/ke), \tag{52}$$

where $W_0$ denotes the principal branch of the Lambert function. As k grows, this approaches 

$$v \approx ke - 1. \tag{53}$$

The position of the front is given by (51) with 

$$\mu = 1 - 1/v. \tag{54}$$

Finally, since the tail of q(x) in the positive direction is given by (49), setting $q_j = 1 - 1/N$ gives the following estimate of the maximum depth, 

$$j_{\max} = j_{\rm front} + \frac{1}{\mu} \ln N. \tag{55}$$

Note that, unlike the minimum depth model, $j_{\max} - j_{\rm front}$ is O(log N) instead of O(log log N), since the tail (49) is exponential rather than doubly exponential. 



IV. HIGHEST DEGREE 

We now consider a model in which quality is measured not by depth, but by the degree of the contact node: the higher the degree, the better. As we will show below, in this case the degree distribution exhibits a power law up to degree i ∼ k, beyond which it decays exponentially. Therefore, in this model we need a large number of choices, k ≫ 1, in order to observe a power law over a wide range of degrees. 



A. Recurrence for the degree distribution 

We start by writing a master equation for the degree distribution of the network. We add one node at each step, so at time t there are t nodes in the network. Let $N_i(t)$ be the number of nodes which have degree i at time t, and let $C_i(t) = \sum_{j=1}^{i} N_j(t)$ be the corresponding total number of nodes of degree i or less at time t. Normalizing these numbers, let $a_i(t) = N_i(t)/t$ be the fraction of nodes which have degree i, and let $c_i(t) = \sum_{j=1}^{i} a_j(t) = C_i(t)/t$ be the corresponding cumulative distribution. 

At each iteration, we choose k contact nodes at random from the t existing nodes, and connect the new node to the contact node of highest degree, with ties broken randomly. The evolution of the expected cumulative degree distribution can be written as, for all $i \ge 1$, 

$$C_i(t+1) = C_i(t) + 1 - \left(c_i(t)^k - c_{i-1}(t)^k\right), \tag{56}$$



since $C_i$ increases by 1 for each new node added, and decreases precisely when the new node connects to a node of degree i. This latter event occurs when all k nodes have degree i or less, but not all have degree i - 1 or less. Writing $c_i(t) = C_i(t)/t$ and making the assumption that a steady-state limit exists, we obtain the recurrence 

$$c_i = 1 - \left(c_i^k - c_{i-1}^k\right), \tag{57}$$

with $c_0 = 0$. We note that in the case k = 1, where there is no choice, the solution to (57) is simply 

$$c_i = 1 - 2^{-i} \quad \text{and} \quad a_i = 2^{-i}, \tag{58}$$

which is the degree distribution of a random recursive tree. 


B. The model with k ≥ 2 choices 

We are particularly interested in the behavior for small k. Recall that the "power of choice" comes from situations where results vary dramatically if k = 2 rather than k = 1. For $k \ge 2$ we can solve (57) analytically only in the regime $i \gg 1$ as discussed in detail below. Yet, for k = 2, equation (57) is very easy to solve numerically as it reduces to the quadratic equation: 

$$c_i^2 + c_i - \left(1 + c_{i-1}^2\right) = 0. \tag{59}$$

Figure 2 is a plot of the degree distribution, $a_i$, for k = 1 and k = 2. Recall $a_i = c_i - c_{i-1}$. The data points are from a numerical simulation with k = 2, grown to size $1 \times 10^6$ nodes. Note the excellent agreement. Though the distribution for k = 2 decays more slowly than that for k = 1, both exhibit exponential decay; thus the nature of the solution is not altered by such a minor amount of choice. 

From numerical simulation with k > 2 we find different behaviors for i > k than for i < k (see Fig. 3). For degree i > k we observe $a_i \sim \exp(-i/k)$. For i < k we observe what appears to be a power law in that regime, $a_i \sim i^{-\gamma}$, with $\gamma \approx 1.5$. The largest k we simulated was k = 32, hence the "power law" regime is quite small. Rather than relying on computer simulation, we can look at the asymptotic limits of (57) and arrive at similar results in the limit $i \gg 1$ and $k \gg 1$. Note that the asymptotic limit gives $\gamma = 1$, and we attribute the difference from the numerical results to finite size effects in simulation. 



FIG. 2: The degree distribution, $a_i$, for the highest degree model, for both k = 1 and k = 2. The points are data from numerical simulation of the model with k = 2. 

FIG. 3: Numerical simulation results for k = 16. Note that for i < k we observe $a_i \sim i^{-1.5}$, while for i > k we observe $a_i \sim \exp(-i/k)$. 

C. Asymptotic limits 



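In the asymptotic regime $i \gg 1$ we write $c_i = 1 - \epsilon_i$ and assume that $\epsilon_i \ll 1$. To first order, $c_i^k = 1 - k\epsilon_i$. Simplifying (57), we find $(k+1)\epsilon_i = k\epsilon_{i-1}$ and therefore 

$$1 - c_i = A_k \left(\frac{k}{k+1}\right)^i \quad \text{when } i \gg 1, \tag{60}$$

where $A_k$ is a constant depending on k. We argue below that 

$$A_k \sim k^{-1} \quad \text{as } k \to \infty. \tag{61}$$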



In the rest of this section we always assume that $k \gg 1$. Let us start with nodes of degree one (which are often called 'leaves'). In this case we have $c_1 = a_1$ and equation (57) reduces to 

$$a_1 + a_1^k = 1. \tag{62}$$

Writing 

$$a_1 = 1 - \frac{W}{k} \tag{63}$$

and assuming that $W \ll k$ yields $a_1^k = e^{-W}$. This allows us to recast (62) into 

$$W e^W = k, \tag{64}$$

so W is Lambert's function W(k). For large k, we have $W(k) \approx \ln k$, justifying our assumption that $W \ll k$. Thus almost all nodes are leaves: the fraction of nodes whose degree exceeds one is $1 - a_1 = W(k)/k \approx (\ln k)/k$. 
Analyzing (57) for i = 2, 3, ... one finds that the following ansatz is useful: 

$$c_i = 1 - \frac{W - w_i}{k}. \tag{65}$$

Plugging (65) into (57) we obtain 

$$1 + e^{w_{i-1}} - e^{w_i} = W^{-1} w_i. \tag{66}$$

Since $W \to \infty$ as $k \to \infty$, Eq. (66) simplifies to 

$$1 + e^{w_{i-1}} - e^{w_i} = 0, \tag{67}$$

whose solution (satisfying $w_1 = 0$) is $w_i = \ln i$. Plugging this into (65) we find that $a_i = c_i - c_{i-1}$ is given by 

$$a_i = k^{-1} \ln\frac{i}{i-1} \quad \text{when } 2 \le i \ll k. \tag{68}$$

The upper bound $i \ll k$ is necessary since we can use (67) instead of (66) only when $w_i \ll W$, which is equivalent to $\ln i \ll \ln k$. Note that we can further simplify (68) when $i \gg 1$, viz. 

$$a_i \simeq \frac{1}{k\, i} \quad \text{when } 1 \ll i \ll k. \tag{69}$$

Thus up to a crossover at i = k, the degree distribution exhibits an algebraic behavior $a_i \sim i^{-1}$ with unusually small exponent. 

The derivation of (60) actually holds when $i \gg k$. Using (60) we compute $a_i = c_i - c_{i-1}$ to give 

$$a_i = k^{-1} A_k \left(\frac{k}{k+1}\right)^i \quad \text{when } i \gg k. \tag{70}$$

The regions of validity of (69) and (70) do not formally overlap. It is reasonable to assume, however, that they remain qualitatively correct at the crossover i ∼ k. Then from Eq. (69) we obtain $a_k \sim k^{-2}$ while Eq. (70) leads to $a_k \sim k^{-1} A_k$. Matching these values we confirm the announced asymptotic of the amplitude, Eq. (61). Furthermore, we find 

$$a_i \sim k^{-2}\, e^{-i/k} \quad \text{when } 1 \ll k \ll i. \tag{71}$$

V. LOWEST DEGREE 

There are situations where one wants to ensure that all nodes have low degree; for instance, consider the case of load-balancing discussed in Sec. I. Thus the final variant we consider is when an incoming node connects to the target node of lowest degree. 

A. Recurrence for the degree distribution 

As in Sec. IV we begin by writing the master equation for the degree distribution of the network. Again let $N_i(t)$ be the number of nodes which have degree i at time t, and now let $C_i(t) = \sum_{j \ge i} N_j(t)$ be the corresponding total number of nodes of degree i or greater at time t. Normalizing, let $a_i(t) = N_i(t)/t$ and let $c_i(t) = \sum_{j \ge i} a_j(t) = C_i(t)/t$ be the complementary cumulative distribution. 

At each iteration, we choose k contact nodes at random from the t existing nodes, and connect the new node to the contact node of lowest degree, with ties broken randomly. The evolution of the expected complementary cumulative degree distribution can be written, for all i > 1, as 

$$C_i(t+1) = C_i(t) + \left[c_{i-1}(t)^k - c_i(t)^k\right], \tag{72}$$

since $C_i$ increases precisely when the new node connects to a node of degree i - 1. This event occurs when all k nodes have degree i - 1 or greater, but not all have degree i or greater. Writing $c_i(t) = C_i(t)/t$ and making the assumption that a steady-state limit exists, we obtain the recurrence 

$$c_i = c_{i-1}^k - c_i^k. \tag{73}$$

We note that in the case k = 1, where there is no choice, the solution to (73) is simply 

$$c_i = 2^{-(i-1)} \quad \text{and} \quad a_i = 2^{-i}, \tag{74}$$

which, as (58), is the degree distribution of a random recursive tree. 

B. The model with k ≥ 2 choices 

For k = 2, (73) is very easy to solve numerically as it reduces to the quadratic equation: 

$$c_i^2 + c_i - c_{i-1}^2 = 0. \tag{75}$$

Figure 4 is a plot of the degree distribution, $a_i$, for k = 1 and k = 2. Recall here, $a_i = c_i - c_{i+1}$. The data points are from a numerical simulation with k = 2, grown to size $1 \times 10^6$ nodes. Note the excellent agreement. With minor choice, the degree distribution is radically altered. 

For all $k \ge 2$ we can show the upper bound on the maximum degree is O(log log N) using a method similar to that in [1, 5]. From (73), for $i \ge 3$ we obtain the upper bound, 

$$c_i < c_{i-1}^k$$

and by recursion: 

$$c_i < c_{i-1}^k < c_2^K, \tag{76}$$

where $K = k^{(i-2)}$. Since $c_2 < 1$, $c_i$ decreases doubly-exponentially. To find $i_{\max}$, the typical largest degree present after addition of N nodes, we set $c_i = 1/N$. Solving this relation we find: 

$$i_{\max} \lesssim \log_k \log_{1/c_2} N = O(\log\log N). \tag{77}$$

FIG. 4: The degree distribution, $a_i$, for the lowest degree model, for both k = 1 and k = 2. The points are data from numerical simulation of the model with k = 2. 
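
The doubly-exponential decay in the lowest degree model is easy to see by solving the steady-state recurrence (73) numerically. The sketch below writes (73) as $c_i + c_i^k = c_{i-1}^k$ with $c_1 = 1$ (every node has degree at least one) and solves for $c_i$ by bisection; the function name and the fixed number of bisection steps are assumptions of this illustration.

```python
def lowest_degree_ccdf(k, i_max):
    """Solve c_i + c_i^k = c_{i-1}^k for i = 2..i_max, starting from c_1 = 1.
    Returns a list c with c[i] the fraction of nodes of degree >= i (c[0] is unused)."""
    c = [None, 1.0]
    for i in range(2, i_max + 1):
        target = c[-1] ** k
        lo, hi = 0.0, c[-1]          # x + x^k is increasing on [0, 1]
        for _ in range(200):         # repeated halving keeps the relative error small
            mid = 0.5 * (lo + hi)
            if mid + mid ** k < target:
                lo = mid
            else:
                hi = mid
        c.append(0.5 * (lo + hi))
    return c
```

For k = 1 this gives $c_i = 2^{-(i-1)}$. For k = 2 the values fall below $10^{-8}$ by i = 7, consistent with the bound (76). In double precision they underflow to zero after a few more steps, so `i_max` should be kept small.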



VI. DISCUSSION 

We explore the "power of choice" in network growth by introducing a minimalist generalization of random recursive trees. At each decision point k > 1 choices are presented and the most desirable one selected. If the criterion is to minimize or maximize network depth, a small amount of choice has a dramatic effect. For k = 1 the depth distribution decays with a Poisson behavior. For k ≥ 2 this Poisson decay is seen for distances close to the root, but for further distances, the depth distribution obeys a traveling wave behavior. If the criterion instead involves node degree, we must distinguish the maximum degree model from the minimum degree one. For minimum degree, choice has a dramatic effect. Going from k = 1 to k = 2 the degree distribution changes from geometric decay to double-exponential decay (and hence the maximum degree observed in the network changes from O(log N) to O(log log N)). In contrast, for maximum degree, a large number of choices, k ≫ 1, must be allowed before a change from the k = 1 behavior is observed. The degree distribution decays exponentially for all small values of k. Once k ≫ 1 a power law distribution results for nodes of degree i < k, while for nodes of degree i > k the distribution decays exponentially. 

We established many results about the depth distribution. Some of them are exact; others (namely those relying on the assumption that the maximum allowed value of the velocity is realized, employed at the end of Sec. II B) utilize a selection principle which is not rigorously established for (21). There is no doubt of the validity of this principle in a broad range of contexts, and there is firm numerical support for all analytical results derived herein. 